Rank | Count | Beginning |
---|---|---|
84885 | 3703 | V |
351 | 2324 | A |
39350 | 2228 | Na |
54940 | 1650 | Podle |
79699 | 1240 | To |
23385 | 1111 | Je |
1765 | 893 | Ale |
54426 | 832 | Po |
57392 | 793 | Pokud |
29662 | 741 | Když |
85895 | 654 | Ve |
48908 | 650 | O |
21716 | 635 | Jak |
20243 | 623 | I |
94534 | 565 | Za |
94528 | 530 | Z |
69568 | 517 | S |
14203 | 503 | Do |
11327 | 468 | Co |
64777 | 457 | Pro |
28421 | 439 | K |
63414 | 396 | Při |
49977 | 394 | Od |
84383 | 385 | Už |
78325 | 381 | Ten |
24287 | 377 | Jeho |
61876 | 310 | Před |
25609 | 297 | Jenže |
75923 | 291 | Ta |
97613 | 280 | Ze |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
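For readers without access to the database, the same count can be sketched in Python. This is a minimal illustration, not the pipeline used to produce the table: the function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical. It assumes, like the query, that words are separated by single spaces. It follows the behaviour of MySQL's `substring_index(sentence, ' ', n)`, so a sentence with fewer than n words is counted whole.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Splitting on a single space and re-joining the first n pieces
    mirrors substring_index(sentence, ' ', n) in the query above.
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    # most_common sorts by descending count, like ORDER BY cnt DESC.
    return counts.most_common(limit)


# Hypothetical toy input, not taken from the corpus.
sample = [
    "V Praze dnes prší.",
    "V létě je teplo.",
    "Podle zprávy se nic nestalo.",
    "A co teď?",
]

top = sentence_beginnings(sample, n=1)
for beginning, count in top:
    print(count, beginning)
```

Passing `n=2`, `n=3`, or `n=4` gives the longer beginnings covered in the following subsections.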